Big Data, Decision Tree Induction, and Image Analysis for the Discovery of Decision Rules for Colon Examination

Author

  • Petra Perner
Abstract

The aim of our research was to develop a method that automatically discovers the decision rules for classifying medical images into normal tissue images and images showing a polyp. We used a data set of images from an endoscope video system used for colon examination. The data set contains 283 normal tissue images and 61 polyp images. The 283 normal images include dark regions and reflections. One must decide whether an image shows a polyp or not, which is a two-class problem. The unequal number of samples in the two classes makes this an unbalanced data set problem. The polyps in the images were identified and selected by a “well-trained” medical expert. Based on these medical images, we study the behavior of two statistical texture descriptors: the co-occurrence matrix texture descriptor and our novel Random-set texture descriptor. We review the theory of both texture descriptors and then apply them to our medical data set. We used a decision-tree induction method, based on our tool “Decision Master”, to learn the classification rules. In both cases, for the full unequally distributed data set and for the balanced data set, we achieved the best error rate with the Random-set texture descriptor. The performance of the co-occurrence matrix texture descriptor was worse. Statistical texture descriptors require sufficiently large textures, which cannot always be guaranteed for medical objects. Since the co-occurrence matrix is based on higher-order statistics, this might be the reason for its worse performance. The results show that decision tree induction and image analysis based on our novel texture descriptor is an excellent method for mining medical images for decision rules, even when the data set is unbalanced. This is not the only advantage of our Random-set texture descriptor: it also gives a flexible way to describe the appearance of the medical objects in symbolic terms, requires less computation time, and can be set up as a software module that can be used flexibly in different systems.
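
To make the co-occurrence matrix texture descriptor mentioned in the abstract more concrete, the following is a minimal sketch of a gray-level co-occurrence matrix and two commonly derived features (contrast and energy). It is not the paper's implementation: the function names, the toy 4x4 image, the pixel offset, and the number of gray levels are all assumptions chosen for illustration, and the Random-set descriptor and the “Decision Master” tool are not reproduced here.

```python
from collections import Counter


def cooccurrence_matrix(image, dx=1, dy=0, levels=4):
    """Return a normalized gray-level co-occurrence matrix.

    image  : list of rows, each a list of integer gray levels in [0, levels)
    dx, dy : pixel offset between the two pixels of each counted pair
             (illustrative default: right-hand horizontal neighbor)
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Count the pair only if the neighbor lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image is too small for the chosen offset")
    # Normalize counts so that the matrix entries sum to 1.
    return [[counts[(i, j)] / total for j in range(levels)]
            for i in range(levels)]


def contrast(p):
    """Weighted sum of squared gray-level differences."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p)
               for j, v in enumerate(row))


def energy(p):
    """Sum of squared matrix entries (high for uniform textures)."""
    return sum(v * v for row in p for v in row)


# Hypothetical 4x4 image with 4 gray levels, for illustration only.
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

glcm = cooccurrence_matrix(toy_image)
features = {"contrast": contrast(glcm), "energy": energy(glcm)}
print(features)
```

Feature values of this kind, computed per image region, could serve as the attributes from which a decision tree learns its rules. The toy image also hints at the limitation noted in the abstract: a small region yields only a few pixel pairs, so the estimated co-occurrence statistics may be unreliable.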

Similar resources

DIAGNOSIS OF BREAST LESIONS USING THE LOCAL CHAN-VESE MODEL, HIERARCHICAL FUZZY PARTITIONING AND FUZZY DECISION TREE INDUCTION

Breast cancer is one of the leading causes of death among women. Mammography remains today the best technology to detect breast cancer, early and efficiently, to distinguish between benign and malignant diseases. Several techniques in image processing and analysis have been developed to address this problem. In this paper, we propose a new solution to the problem of computer aided detection and...

Land Cover Classification Using IRS-1D Data and a Decision Tree Classifier

Land cover is one of the basic data layers in geographic information systems for physical planning and environmental monitoring. Digital image classification is generally performed to produce land cover maps from remote sensing data, particularly for large areas. In the present study the multispectral image from IRS LISS-III image along with ancillary data such as vegetation indices, principal componen...

Development of a Combined System Based on Data Mining and Semantic Web for the Diagnosis of Autism

Introduction: Autism is a nervous system disorder, and since there is no direct diagnosis for it, data mining can help diagnose the disease. Ontology as a backbone of the semantic web, a knowledge database with shareability and reusability, can be a confirmation of the correctness of disease diagnosis systems. This study aimed to provide a system for diagnosing autistic children with a combinat...

Performance Evaluation of Decision-Making Units Using Window Data Envelopment Analysis and Decision Trees

Efficiency is an issue of importance and interest to both managers of different organizations and customers who use the products and services of these organizations. The aim of this research is to study the efficiency of pharmaceutical companies accepted in the Stock Exchange Organization by using Window Data Envelopment Analysis (WDEA) and then, to provide some rules based on the decision tree...

Publication date: 2017